ABSTRACT



This study investigated the opinions of college faculty and 
administrators regarding the purpose, control, and process of performance 
evaluation, hypothesizing that job orientations and expectations would 
influence their opinions: that administrators would favor an economic model
emphasizing authoritative and quantitative measures; teachers would favor an 
information model emphasizing networking relationships; and researchers would 
favor a hybrid approach. Questionnaires sent to three Canadian universities 
and completed by administrators, professors, instructors, and researchers 
yielded 116 usable replies. The questionnaire's 54 items focused on: purposes 
of performance evaluation; control and process of performance evaluation; 
standards; validity of performance indicators; overall opinions on the issues 
of purposes, control, and process, as well as satisfaction with existing 
performance evaluation systems; and demographics. Results indicated that job 
orientation and expectations of respondents influenced their views on 
purposes, control sources, and implementation procedures of performance 
evaluation. Administrators favored an economic model; teachers favored an 
information model; and researchers favored a hybrid approach. Respondents 
believed evaluation should be annual for nontenured faculty and every two to 
three years for tenured faculty. There was substantial agreement among 
respondents about the appropriate list of performance indicators. Data tables 
and diagrams of rank-ordered means are appended. (Contains 19 references.) 
Abstract



This is an empirical investigation of the differential effects of the Economic and Information
models in identifying purposes, control sources and implementation procedures of a
performance evaluation system in higher education. The t test of mean differences, one-way
ANOVA F test and structural groupings from factor analysis reveal that, owing to their varied
job orientations and expectations, administrators tend to associate with the characteristics of
the Economic model, teachers with the Information model, and researchers with a mixture of
both. Nevertheless, there was substantial agreement among those surveyed about the
appropriate list of performance indicators, based on the ranking of sample means.
Opinions of Administrators and Faculty on the Purposes, Control and Process of
Performance Indicators in Higher Education: A Pilot Study 



In recent years, there has been increasing concern and pressure arising from all levels 
of society for quality assessment and public accountability in higher education. Central to the 
quality assessment of higher education is the performance evaluation of faculty. Academic 
institutions (universities, faculties, departments, etc.) may have diverse values reflecting 
differing conceptions of higher education and the knowledge of these values is essential in 
determining the form and function of performance evaluation procedures. Naturally, different 
faculties and departments within any given institution may employ various forms of 
evaluation. The conflict in these dimensions has been demonstrated in the European 
experience over the last decade (Cave, Hanney, Henkel, & Kogan, 1997). In this study, we
introduce two further research aspects: the differing views of administrators and faculty on
the demand characteristics of academic activities; and the different job orientations
and expectations (research, teaching, service, etc.) among faculty, even within the same
faculties or departments. These differences may require varied purposes and forms of
evaluation. No empirical evidence on these issues has been reported in the North American
context, although it is well known that goals and procedures in faculty evaluation are diverse.
Therefore, performance evaluation may become an important source of apprehension and 
misconception in academic settings (Newson, 1995). It is the primary objective of this study 
to conduct a survey of opinions held by faculty and administrators in a sample of three 
universities concerning the purpose, control and process of performance evaluation. 







Theoretical Framework and Hypotheses 



Performance Indicators (PIs) are defined by Darling-Hammond (1992) as "individual
or composite statistics that reflect important features of a system, such as education, health, or
the economy" (p. 236). More specifically, PIs are often defined as quantitative measures of
some important aspects of university operations relative to institutional goals (Borden &
Bottrill, 1994; Cave, Hanney & Kogan, 1991; Kells, 1990). Unfortunately, as Benjamin
(1996) pointed out, this definition is problematic in at least four respects: (i) it mistakenly
implies that quantitative measures are better than qualitative ones, (ii) it may ignore the
different needs of administrators and faculty in the implementation of performance evaluation,
(iii) it tends to sacrifice the specific and itemized nature of PIs in protecting the generality of
institutional goals, and (iv) it could erroneously treat all PIs as equally useful. In short, the
definition and construction of PIs in any educational setting should be more empirically
oriented than theoretically driven. This is the approach we follow here. In particular, we
empirically examine the relevance of two models in developing PIs in higher education:
the Economic model (proposed by Cave and his colleagues) and the
Information, or Communicative Action, model (proposed by Habermas, 1989, 1991, and
reintroduced by Barnett, 1994).

In the Economic model, the conception of performance indicators (PIs) derives from
the fact that the educational system can be considered as a process within a wider economic
system which converts inputs (such as time spent on duties, faculty salaries, etc.) into outputs
(such as number of graduates, research publications, etc.). In this model, PIs in higher
education may be described as authoritative and quantitative measures of attributes of the
activities of institutions and component units (Cave et al., 1991). They entail the collection of
data at different levels of aggregation to aid in forming judgments on faculty performance,
judgments which may be made within departments, within institutions or at the level of
the higher education system as a whole. Concrete steps in the Economic model can be specified
either by the Production approach (Nedwek & Neal, 1994) or the Student Development
approach (Pascarella & Terenzini, 1991). In the Production approach, universities are considered
as manufacturing plants that transform entering students into graduates. Variants of the
Production approach are the only ones in current use in Europe and North America
(Benjamin, 1996; Borden & Bottrill, 1994; Gaither, Nedwek & Neal, 1994; Nedwek & Neal,
1994). Under the Student Development approach, PIs are designed to facilitate the attainment
of "whole students" who not only know course and program contents (cognitive indicators)
but are also integrated into academic and social communities (maturation indicators). In short,
all variants of the Economic model would lead to an outcome-based performance evaluation
system.

In the Information model, performance evaluation in higher education is quality-oriented,
serving different ends ("Purpose"), being conducted by separate parties ("Control")
and employing different techniques ("Process"). It emphasizes a networking relationship that
takes into consideration the needs, as well as the contributions, of all participants in the
system. Barnett (1994) explained this model in terms of three conceptual axes: enlightenment,
power and form. For an example, see Figure 1. First, the purpose dimension spans the
enlightenment axis. At one end, quality improvement is embedded in the premise that
understanding will be maximized where it is self-oriented ("Emancipatory"). At the other end,
quality evaluation is aimed at external auditing and recognition ("Technicist"). Secondly, the
control dimension is represented along the power axis. At its opposite ends are methodologies
essentially under the control of either the professional staff ("Collegial") or authorities
external to them ("Administrative"). The third analytical dimension is the process of
evaluation, which can be analyzed along the form axis. Its two extremes consist of methods
for quality assessment against the practical or gold standard of performance ("Bureaucratic")
versus methods of quality improvement based on a universal or logical ideal ("Professional").
Essentially, in comparing the two models, it seems that the Economic model is mainly
concerned with the endpoints along the continua conceptualized in the Information model.
Items reflecting both of these models were included in the questionnaire survey.



Insert Figure 1 about here 



Ramsden (1991) has argued that questions on PIs taken out of context, or data analyses
based on individual responses, are often misleading. Moreover, there are difficulties in
interpreting students’ evaluations of individual faculty members (Pollitt, 1990; Warnock,
1989). Therefore, in this study, the questions were addressed directly to faculty and
administrators and included items related to their opinions about the performance evaluation
system as a whole. Because PIs are essentially about the relative performance of aggregates, it
is necessary that the question items and results be designed and analyzed at the aggregate
level.

In psychological experiments, demand characteristics refer to the cues and other
information used by participants to guide their behaviors or responses (Orne, 1962). In
our questionnaire survey, it was explained that, on the basis of the demand characteristics of
academic activities, academic workers in higher education can be divided into three groups:
Administrators, Teachers and Researchers. Respondents were asked to classify themselves into
one of these job categories. Although most respondents may engage in all of these activities,
their involvement often varies in degree. Therefore, it was hypothesized that the emphasis
on their own job orientations and expectations would influence their opinions on the purposes,
control and procedures of a performance evaluation system. It was also hypothesized that
Administrators would favor the Economic model, Teachers would support the Information
model and Researchers would prefer a hybrid approach.

Methods 

Questionnaire 

The questionnaire consists of 54 items grouped into six parts: (i) purposes of
performance evaluation (16 items: P1 to P16), (ii) control and process of performance
evaluation (21 items: C17 to C36; item C37, for open comments, was deleted), (iii) standards (1
item: I38A to I38O, for 14 performance indicators and an open option), (iv) validity of PIs (2
items: I39A to I39N and I40A to I40N, for "objective" and "subjective" validity of the 14 PIs,
respectively), (v) overall opinions on the issues of purposes (P41), control (C42) and process
(I43) as well as satisfaction (S44) with existing performance evaluation systems, and (vi)
demographic information (10 items: D1 to D10, for job orientation, professional backgrounds,
age, gender, annual income, external grants and subordinates, etc.). With three exceptions, all
items in the first four parts use 7-point Likert-type scales (1 to 7, with 4 as the
neutral anchor). Items C17 and C18 ask about the frequency of performance evaluation (from 1
to 4 years plus "Other"). Item I38 asks respondents to weight each of the 14 PIs (plus "Other", if
necessary), with weights that must sum to 100%. The four items in part (v) each use a continuum
with 10 anchor points to reflect the axes in Habermas’s (1989) Information model.
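
As an illustration of the constraint on the weighting item (14 indicators plus an optional "Other", with weights summing to 100%), a response could be screened as in the following minimal sketch. The function name, the placeholder labels A to N and the example weights are hypothetical and are not taken from the survey instrument or its data.

```python
from math import isclose

def validate_weights(weights, n_indicators=14, total=100.0, tol=1e-6):
    """Return True if a response to the weighting item is usable.

    `weights` maps indicator labels to percentages. The label "Other" is
    optional; all listed indicators are assumed to need a value.
    """
    listed = {k: v for k, v in weights.items() if k != "Other"}
    if len(listed) != n_indicators:
        return False          # an indicator was skipped or an extra one added
    if any(v < 0 for v in weights.values()):
        return False          # negative percentages are not meaningful
    return isclose(sum(weights.values()), total, abs_tol=tol)

# Hypothetical response: labels A..N stand in for the 14 indicators.
labels = [chr(ord("A") + i) for i in range(14)]
response = {label: 7.0 for label in labels}   # 14 * 7 = 98
response["Other"] = 2.0                       # brings the total to 100

is_valid = validate_weights(response)
print(is_valid)
```

A response whose weights do not sum to 100% would be flagged rather than silently rescaled; whether to rescale or discard such responses is a design choice the text does not specify.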

Data Collection 

Three hundred questionnaire forms were sent to three universities in a prairie province 
of Canada: a large doctoral university with two campuses in the same city, one medium-sized 
urban non-doctoral university and one small rural four-year institution. Two populations in 
higher education were surveyed: "Administrators" (e.g., presidents, vice presidents, deans, 
associate deans, directors and department heads) and "Faculty" (e.g., professors, researchers, 
instructors) in Arts and Humanities. Addresses were recorded from the Internet Web sites of 
the universities, university catalogues and telephone directories. Questionnaire forms were 
mailed to all identified authorities for the "Administrators" sample. For the "Faculty" sample, 
questionnaire copies were sent to identified faculty as well as heads of departments/units for 
distribution. A downloadable version of the questionnaire was also made available on the 
Internet Web page of the first author. With the exception of information about their
positions and institutions, the identities of respondents were unknown, and all individual
responses were kept confidential.

Results 

There were 125 returns, of which 9 were unusable due to wrong addresses or
unidentified recipients, resulting in 116 completed, usable forms. The number of returns
among the three universities, 65 (56%), 32 (28%) and 19 (16%), respectively, resembles the
relative distributions of the targeted populations. Seventy-six percent of respondents were
male, 89% were employed full-time and 39% had external grants. On average, the typical
respondent had 20 years of experience, was between 45 and 55 years old, had an annual income
between $60,000 and $70,000 (Canadian funds) and had about 4 to 5 students or employees
under direct supervision. Respondents spent 41% of their time on teaching, 30% on research,
11% on department-level service, 6% on faculty-level administration and 12% on other activities
(university-level, community service, etc.). Responses in the total sample were sorted into
three subsamples according to self-reported job orientation: Administrators, Teachers and
Researchers. Ten respondents failed to identify their job orientation, resulting
in 106 usable observations for the subsamples. Demographic characteristics of the total sample
and subsamples are given in Table 1. The division in job orientations and expectations is
reflected in how working time was spent in the three subgroups: 65% of Administrators’ time was
spent on service (Department-level, Institutional-level and Others), 50% of Teachers’ time on
Teaching and 53% of Researchers’ time on Research.
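
The return counts and percentages reported in this paragraph can be checked with simple arithmetic. The short sketch below uses only the figures quoted in the text; the variable names are illustrative.

```python
# Figures quoted in the text
returns = 125
unusable = 9
usable = returns - unusable                 # completed, usable forms

per_university = [65, 32, 19]               # returns from the three universities
assert sum(per_university) == usable

# Share of usable returns per university, rounded to whole percentages
shares = [round(100 * n / usable) for n in per_university]
print(shares)                               # [56, 28, 16]

unclassified = 10                           # no job orientation reported
subsample_total = usable - unclassified     # observations available to the subsamples

# Reported allocation of working time in the total sample (percent)
time_allocation = [41, 30, 11, 6, 12]
print(usable, subsample_total, sum(time_allocation))
```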



Insert Table 1 about here 



In the following, data analysis was conducted by means of descriptive statistics, t and
ANOVA F tests, and factor analysis. Main results, and explanations of variables when
necessary, are given in the tables. To identify the relatively most important variables
in each table, their arithmetic means were ranked and compared within the total sample and
subsamples. Then, two statistical procedures were used to evaluate the hypothesis that
Administrators favor the Economic model whereas Teachers and, to a lesser degree,
Researchers support the Information model. First, one-way ANOVA F tests were used to detect
effects of job orientations and expectations across independent subsamples. Subsequently,
latent classes formed across samples by factor analysis with varimax rotation were interpreted
in light of their grouped variable components. For this purpose, the factorial groups were
labelled on the basis of their component constitution. All statistical tests were evaluated at α
= .05 (with Bonferroni adjustment for t test statistics).
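
The omnibus one-way ANOVA F statistic and the Bonferroni-adjusted significance level mentioned here can be sketched as follows. This is a minimal illustration only: the ratings are made-up values on a 7-point scale, not survey data, the function names are hypothetical, and the choice of three pairwise comparisons is an assumption because the number of t tests in each family is not stated.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for independent groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni adjustment."""
    return alpha / n_comparisons

# Made-up ratings of one item by three subsamples (illustration only)
administrators = [6, 5, 6, 7, 5]
teachers = [4, 5, 4, 3, 4]
researchers = [3, 2, 4, 3, 3]

f_stat, df_b, df_w = one_way_anova_f(administrators, teachers, researchers)
adjusted = bonferroni_alpha(0.05, 3)   # assumes three pairwise t tests
print(f_stat, df_b, df_w, adjusted)
```

The F statistic would then be compared with the critical value of the F distribution on (df_between, df_within) degrees of freedom; a statistics library is needed for that step and is omitted here.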

Relative Importance of PI Purposes. Nine potential goals of a performance evaluation system
are listed in Table 2. Each item is rated on a seven-point scale: the first three ratings (1, 2
and 3) indicate "Unimportant" and the last three (5, 6 and 7) signify "Important". In the total
sample, the items are grouped into three latent classes: "Improvement/Emancipatory" (Teaching,
Research, Service), "Development" (Promotion, New and Tenured Faculty), and
"Comparison/Technicist" (Intra-department, Inter-department and Inter-university).

From the ordered item means, the components of "Comparison" are relatively 
unimportant (means < 4) whereas the three most important goal variables (means > 5) are 
identified as Teaching, New Faculty and Promotion. Only the rating of Service is significantly 
different among the three subsamples of Administrators (largest, mean > 5.3), Teachers and 
Researchers (smallest, mean < 3.5) according to the omnibus ANOVA F test. 
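
The rank-ordering of item means used throughout the Results, with means below 4 read as relatively unimportant and means above 5 as important, can be sketched as below. The ratings are invented for illustration and do not reproduce Table 2; the function name and the label "intermediate" are assumptions.

```python
from statistics import mean

def rank_items(ratings, low=4.0, high=5.0):
    """Rank items by mean rating (highest first) and label each one."""
    rows = []
    for item, scores in ratings.items():
        m = mean(scores)
        if m > high:
            label = "important"
        elif m < low:
            label = "unimportant"
        else:
            label = "intermediate"
        rows.append((item, m, label))
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Invented ratings on the 1-to-7 scale (illustration only)
example = {
    "Teaching": [6, 5, 6, 7],
    "Promotion": [5, 6, 5, 6],
    "Service": [4, 5, 4, 5],
    "Inter-university": [2, 3, 4, 3],
}

ranked = rank_items(example)
for item, m, label in ranked:
    print(f"{item:18s} {m:4.1f}  {label}")
```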

Results from the Economic model (presumably represented by Administrators’
responses) were quite different from those of the Information model (presumably supported
by Teachers). First, the variables of improvement and monitoring in Research, Service and
Tenured Faculty were rated highest in the Administrators sample (means > 5). Secondly,
whereas the latent grouping of the Teachers sample was the same as that of the total sample,
the Administrators sample yielded five classes: "Comparison-plus" (the three "Comparison"
components plus Teaching), "Evaluation" (Research, New Faculty) and three remaining
latent classes, each with only one component. Researchers presented a mixture of these two
models, with Intra-department moving out of "Comparison" and into "Development."
Moreover, Research was not in "Development" but became a class by itself. Ordered means
and factorial groupings are depicted along the three axes of Administrators, Teachers and
Researchers in Figure 2.a.



Insert Table 2 about here 



Institutions that Determine PI Purposes. Participants were asked to rate the relative leverage
of seven institutional bodies in making decisions on the purposes of performance evaluation
(Table 3). In all samples, the Department-level unit was consistently rated highest (mean > 5.0)
whereas public institutions (Government and Consulting Agency) were rated lowest (means
< 2.0) (Figure 2.b). The rating of University-level was significantly different among the three
subsamples according to the omnibus F test (rated highest by Administrators and lowest by
Researchers).

The seven institutions in Table 3 were grouped into four latent classes in the total
sample: "Union/Emancipatory" (Faculty Union), "Authority/Technicist" (Board of Governors,
University-level), "Professional" (Faculty-level, Department-level), and "Public" (Government
and Consulting Agency). The disparity in the implications of the Economic and Information
models can be detected from the fact that only Administrators supported Faculty-level
determination of PI purposes, and that the groupings of latent classes for the Administrators
and Teachers samples differ (Figure 2.b). Although all other groupings in the total sample
were maintained, Department-level was reclassified from "Professional" to "Union" in the
Administrators sample, implying that departmental units would collaborate with the Faculty
Union (Emancipatory) whereas all other entities represented different levels of authority
(Technicist). The four classes from the Teachers sample were: "Union/Emancipatory" (Union),
"Higher Authority/Technicist" (Government, Board of Governors), "Lower
Authority/Technicist" (University-level, Faculty-level) and "Professional/Emancipatory"
(Department-level, Consulting Agency). Researchers reproduced two classes of the Teachers’
grouping and modified the other two as "Authority 1" (Board of Governors, Faculty-level)
and "Authority 2" (University-level, Department-level, Government).



Insert Table 3 about here 



Frequency of Conducting the Performance Evaluation. All samples indicated that evaluation
should be annual for non-tenured faculty and every two to three years for tenured
faculty (Table 4).



Insert Table 4 about here 



"Triggers" of Performance Evaluation . From a list of 11 possible "triggers" (Table 5) by 
which a formal evaluation process could be initiated, only two were supported (means > 5):
Self Request and Automatically (i.e., by the passage of calendar time). However, Researchers
rated only Automatically above 5 among all triggers (Figure 2.c).

From the total sample, 5 latent classes were found: "Routine" (Automatically), 
"Concerned Agents/Collegial" (Self Request, Peers, Students, Union), "Lower 
Authority/Administrative" (Department Head, Department-level, Faculty-level), "Higher 
Authority/Administrative" (University-level, Government) and "Public" (Granting Agencies). 
Teachers reproduced this grouping except that Students was found in "Public." On the other
hand, there were eight classes in the Administrators sample. Besides "Lower
Authority/Administrative" (without Faculty-level) and "Higher Authority/Administrative", the
rest were single-component classes (Figure 2.c). Researchers conceptualized only four latent
groups: "Routine" (Automatically), "Concerned Agents 1" (Self Request, Union, Government, 
Granting Agencies), "Concerned Agents 2" (Peers, Students) and "Authority/Administrative" 
(Department Head, Department-level, University-level, Faculty-level). 



Insert Table 5 about here 



Administrators of Performance Evaluation. There are eight institutions that presumably can
administer the routine process of performance evaluation (Table 6). Results from all samples
imply that the implementation of performance evaluation should be a Department-level
responsibility. As expected, Faculty-level was also supported by Administrators (mean > 5)
(Figure 2.d). The rating of Faculty Union was significantly different among the three
subsamples according to the omnibus F test (with the lowest rating by Administrators and the
highest by Researchers). Since the rating by Teachers was much closer to that by Researchers,
this points to an underlying disparity between the Economic and Information models on this issue.

The four latent classes in the total sample were identified as "Union/Collegial"
(Union), "Public" (Licencing Body, Government, Consulting Agency), "Higher
Authority/Administrative" (Board of Governors, University-level), and "Lower
Authority/Administrative" (Department-level, Faculty-level). In the Administrators sample,
Licencing Body was assigned to "Higher Authority." Besides these two groups, the remaining
variables formed one-component classes. Teachers reproduced the classification of the total
sample, except that "Lower Authority" disappeared since Government and Department-level
formed a class and Faculty-level was moved into "Higher Authority." The latent grouping in
the Researchers sample was quite similar to that of Teachers (Figure 2.d).



Insert Table 6 about here 



Holders of Individual Performance Information. If a formal performance evaluation had been
implemented, which institutions or entities could have access to the final information besides
the individual faculty member involved? Table 7 lists nine potential recipients. Access by
university central administration (University-level) was endorsed in three samples (mean > 5), but
clearly opposed by Teachers (mean = 1). On the other hand, Faculty Union access was
supported by both Administrators and Teachers (means > 5) but only weakly by Researchers
(mean < 4.5) (Figure 2.e).

The nine variables were grouped into eight classes by factor analysis in three samples
(all except Administrators), with only one multiple-component class: "Concerned Agents"
(Students, Private Agencies). In the Administrators sample, six latent groups were formed, with
two multiple-component factors: "Concerned Private" (Students, Department Head, Private
Agencies) and "Concerned Public" (Public, Government). This again implies a structural
difference in the implications of the Economic (Administrators) and Information (Teachers and
Researchers) models (Figure 2.e).



Insert Table 7 about here 



The Fourteen Performance Indicators: Relative Weights of Importance. Respondents were
asked to assign a percentage to each of 14 PIs (summing to 100%) according to its relative
importance in a performance evaluation system (Table 8). The two PIs uniformly given
weights of more than 10% were Course Evaluation and Book Publication. Added to this list
were Grants in both the Administrators and Researchers subsamples, and Peer-reviewed Journal
Publication in the Teachers and Researchers subsamples (Figure 2.f). The three variables of
Grants, Peer-reviewed Journal Publication and Number of Courses Taught were significantly
different among the three subsamples according to the omnibus F test. The ranking of ordered
means can be used to explain this outcome as well as to analyze the departure of
Administrators’ opinions from those of Teachers and Researchers. Grants were weighted
much more heavily by Administrators and Researchers than by Teachers; Number of Courses
Taught was weighted highest by Administrators and lowest by Researchers; and Peer-reviewed
Journal Publication was weighted much higher by Teachers and Researchers than by
Administrators.

Both the total and Teachers samples yield eight latent classes with three of them 
having multiple variables. In the total sample, they are "Formal Achievement" (Grants, Peer- 
reviewed Conference Presentation, Peer-reviewed Journal Publication, Book Publication), 
"Informal Achievement" (Non-peer reviewed Conference Presentation, Non-peer reviewed 
Journal Publication) and "Teaching Quality" (Student Supervision, Graduate Success). Besides 
"Formal Achievement", the two multiple-component classes formed by Teachers are 
"Recognition 1" (Non-peer reviewed Conference Presentation, Reputation) and "Recognition 
2" (Non-peer reviewed Conference Presentation, Award). The Researchers sample yields nine 
classes but the groupings are still quite similar to those in the Teachers sample (Figure 2.f). 
Finally, there are 10 latent classes in the Administrators sample with three multiple groups: 
"Recognition" (Years of Experience, Reputation), "Formal Achievement" (Grants, Peer- 
reviewed Journal Publication, Book Publication), and "Informal Achievement" (as above). 



Insert Table 8 about here 



The Fourteen Performance Indicators: "Subjective" Validity. Respondents were asked to rate
the relative suitability of 14 PIs to their individual and institutional situations (Table 9). Peer-
reviewed Journal and Book publications were rated highest (means > 5), and Graduate
Success was rated lowest (mean < 3), with respect to "subjective" validity in all samples.
Added to this list were Grants (rated by Administrators; mean > 5), and Years of Experience
and Non-peer reviewed Conference Presentation (rated by Researchers; means < 3) (Figure
2.g). The omnibus F tests were significant for Peer-reviewed Journal Publication (highest for
Researchers and lowest for Administrators) and Number of Courses Taught (highest for
Teachers and lowest for Researchers).

Seven latent classes were found in the total sample, with four multiple-component classes,
namely, "Formal Achievement" (Grants, Peer-reviewed Journal Publication, Book
Publication), "Student Contribution" (Number of Students Supervised, Number of Courses
Taught), "Work Quality" (Community Service, Course Evaluation by Students) and
"Recognition" (Graduate Success, Reputation, Award). Besides "Formal Achievement" and
"Student Contribution", the other two multiple-component classes, out of a total of six, in the
Teachers sample were "Informal Achievement" (Non-peer reviewed Conference Presentation,
Non-peer reviewed Journal Publication) and "Contribution-plus" (Peer-reviewed Conference
Presentation, Community Service, Course Evaluation, Award). Researchers conceptualized
seven latent classes with four multiple-component groupings (Figure 2.g). Besides "Formal
Achievement" and "Work Quality" as above, the other groups were "Informal Contribution"
(Non-peer reviewed Conference, Students-supervised), and "Recognition" (Graduate Success,
Reputation). On the other hand, there were 10 latent classes in the Administrators sample with
only one multiple-component group, "Achievement-plus" (Grants, Peer-reviewed Journal
Publication, Peer-reviewed Conference Presentation, Number of Students Supervised).



Insert Table 9 about here 



The Fourteen Performance Indicators: "Objective" Validity. Respondents were asked to rate
the relative suitability of the listed PIs to general academic settings (Table 10). The PIs that
received the highest ratings (means > 5) for "objective" validity in all samples were Peer-
reviewed Journal Publication, Peer-reviewed Conference Presentation, Book Publication,
Number of Students Supervised, and Reputation. As expected, Grants were highly rated by
Administrators (Figure 2.h). Both Grants (highest for Administrators) and Number of Courses
Taught (highest for Teachers and lowest for Researchers) were statistically significant by the
omnibus F test. On the other hand, Award, Course Evaluation, and Graduate Success were
unanimously dismissed as objectively valid performance indicators across all samples (means
≈ 1).

Seven latent classes were formed in the total sample, with four multiple-component groups,
namely, "Formal Achievement" (Grants, Peer-reviewed Journal Publication, Peer-reviewed
Conference), "Informal Achievement" (Years of Experience, Non-peer reviewed Conference
Presentation, Non-peer reviewed Journal Publication), "Teaching Quality" (Graduate Success,
Course Evaluation by Students), and "Recognition" (Number of Courses Taught, Reputation).
Eight classes were factorially grouped in the Administrators subsample, with three
multiple-component groups: "Achievement-plus" (Grants, Peer-reviewed Journal Publication,
Peer-reviewed Conference Presentation, Number of Courses Taught, Reputation), "Informal
Achievement" (Non-peer reviewed Conference Presentation, Non-peer reviewed Journal Publication),
and "Teaching Quality" (as above). The Teachers sample produced 13 classes with only one
multiple-component group (namely "NPR Achievement"). Among the 12 factorial classes in the
Researchers sample, two were multiple-component groups, namely "Informal Achievement" and
"Course Contribution" (Number of Courses Taught, Course Evaluation by Students) (Figure 2.f).



Insert Table 10 and Figure 2 about here 



It is of interest to study the difference between the two sets of validity ratings. The t test
statistics of mean differences for independent groups were significant for the ratings of
Number of Students Supervised, Graduate Success, Course Evaluation by Students and
Reputation across all samples. This statistical significance implies that the roles and validity
of these PIs for performance evaluation varied relative to the identified parties, namely,
whether they were viewed by individuals or institutions. Only the variable of Non-peer
reviewed Journal Publication was significantly different among the three subsamples according
to the omnibus F test on difference scores. However, this finding is not meaningful given its
low ratings in all samples (mean < 4 for both subjective and objective validity settings).



Insert Table 11 about here



Structural Factors of a Performance Evaluation System. What are the overall characteristics
of a system of performance evaluation in higher education when measured along the
three axes of Purposes, Control and Procedures, as conceptualized by Habermas (1989) (Table
12)? On the scales from 1 to 10, respondents seemed to feel neutral between
"Evaluative/Technicist" and "Informational/Emancipatory" along the Purposes continuum (5 <
means < 6), showed a slight preference for "Internal/Collegial" control rather than
"External/Administrative" control (means < 4) and were indifferent in the choice between
"Professional" and "Institutional/Bureaucratic" procedures or standards of evaluation (means ≈
5). These findings were uniform across all samples. Is the existing system of
performance evaluation satisfactory? Teachers and Researchers tended to be neutral (means >
5) whereas Administrators seemed to be somewhat dissatisfied (mean < 5).



Insert Table 12 about here 



Summary and Conclusions 

From the survey findings, it is evident that respondents believe a performance evaluation 
system should be designed with the aim of improving teaching, monitoring the development of 
new faculty and making promotion and tenure decisions. It should not be used for comparison 
at departmental or higher levels. The goals of a performance evaluation system should be 
determined at the departmental level, independent of governmental or private third-party 
influence. It is sufficient to conduct an evaluation annually for non-tenured faculty and 
every two to three years for tenured faculty. Beyond routine and informal performance 
review, a formal evaluation should be opened only at the request of the faculty member 
concerned. Such a formal evaluation is best administered at the departmental level. 
Respondents preferred that only the individuals themselves have access to information about 
their own performance evaluation. 

There is a widely held belief that performance quality in higher education is a many-sided 
yet ultimately elusive phenomenon. This conviction has led several researchers to doubt 
whether unambiguous scales of measurement suitable as PIs could ever be derived (Cave et 
al., 1988; Smith, 1988). This conclusion seems altogether too pessimistic. Although 
identifying "good" performance indicators and performance evaluation systems is undoubtedly a 
complicated matter, this study shows a substantial measure of agreement among those 
surveyed about their essential characteristics. The most relevant PIs were found to be 
publication in peer-reviewed journals and books as well as course evaluation by students. 
Although typical respondents considered publication records (peer-reviewed journals and 
books) the most appropriate performance measures for themselves, other indicators such as 
peer-reviewed conference participation, number of students supervised, and reputation among 
peers would also be suitable in general. On the other hand, such PIs as awards, merits, public 
recognition, and career success of former students were unanimously dismissed as objectively 
valid performance yardsticks. The role of course evaluation by students as a performance 
indicator is unclear. It was rated highest for its relative importance among 14 PIs in all 
samples except Researchers, and lowest as a subjectively valid measure of performance in all 
samples. 

This study demonstrates that the job orientations and expectations of participants in higher 
education influence their views on the purposes, control sources and implementation 
procedures of a performance evaluation system. Administrators favor the Economic model, 
Teachers support the Information model, and Researchers prefer a hybrid of the two. 

In this pilot study, our conclusions may apply only to the three universities involved. 
However, it is hoped that our findings, based on a relatively long and sophisticated 
questionnaire, will provide some answers to Benjamin's (1996) four issues with respect to the 
commonly held definition of PIs presented previously. This model-based study will set the 
stage for our next investigation, in which the same questionnaire will be sent to a larger 
number of departments and universities. 

Table 1. Demographic characteristics of survey respondents ¹

Variable                    Total         Administrs    Teachers      Researchers
                            N = 116       N = 16        N = 66        N = 24
                            Mean (%)      Mean (%)      Mean (%)      Mean (%)

D1. Place of Employment
    Large University        65            8             36            14
    Medium University       32            7             17            5
    Small University        19            1             13            5
D2. Employment Status
    Full-time               103           16            60            19
    Non Full-time           13            0             6             5
D3. Years Employed          19.90         15.71         21.14         20.52
D4. Age                     54            50            55            53
D5. Gender
    Males                   88            15            46            18
    Females                 28            1             20            6
D6. Annual Income           65K           60K           70K           60K
D7. External Grants         (0.39)        (0.19)        (0.44)        (0.32)
D8. Subordinates            4.68          2.31          3.52          10.13
D9. Time spent on duties
    Teaching                (0.41)        (0.23)        (0.50)        (0.28)
    Research                (0.30)        (0.09)        (0.26)        (0.53)
    Department-level        (0.11)        (0.21)        (0.10)        (0.09)
    Faculty-level           (0.06)        (0.03)        (0.07)        (0.07)
    Institutional-level     (0.05)        (0.13)        (0.03)        (0.02)
    Others                  (0.07)        (0.31)        (0.04)        (0.01)

¹ In this and all subsequent tables, the sum of subsample frequency counts in each row is not 
equal to the total sample count due to 10 missing values for self-reported job orientation. 

Table 2. Relative importance of the various purposes served by performance evaluation

                         Total              Administrs         Teachers           Researchers
                         N = 116            N = 16             N = 66             N = 24
Variable                 Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

P1 Teaching              5.71 (1.38) F1     6.25 (1.24) F1     5.66 (1.45) F1     5.43 (1.31) F1     1.70 (.189)
P2 Research              4.41 (2.08) F1     5.37 (1.93) F2     4.23 (2.07) F1     4.46 (1.93) F2     2.07 (.131)
P3 Service               3.97 (1.96) F1     5.40 (1.24) F3     3.91 (1.96) F1     3.43 (1.83) F1     5.46 (.006)
P4 Promotion             5.17 (1.69) F2     5.19 (1.64) F4     5.01 (1.76) F2     5.37 (1.66) F3     0.40 (.673)
P5 New Faculty           5.30 (1.52) F2     5.31 (1.35) F2     5.17 (1.60) F2     5.37 (1.55) F3     0.18 (.836)
P6 Tenured Faculty       4.88 (1.73) F2     5.19 (1.47) F5     4.83 (1.80) F2     4.79 (1.59) F3     0.31 (.731)
P7 Intra-department      3.27 (1.89) F3     2.93 (1.53) F1     3.18 (1.78) F3     3.42 (2.04) F3     0.34 (.714)
P8 Inter-department      2.50 (1.58) F3     2.80 (1.78) F1     2.48 (1.51) F3     2.37 (1.34) F4     0.38 (.686)
P9 Inter-university      2.261 (1.60) F3    2.40 (1.76) F1     2.17 (1.55) F3     2.50 (1.64) F4     0.43 (.651)

P1 = to improve teaching, P2 = to improve research, P3 = to improve service, P4 = to make 
promotion & tenure decisions, P5 = to monitor development of new faculty over time, P6 = to 
monitor continued performance of tenured faculty, P7 = to compare performance of 
individuals within departments, P8 = to compare performance between departments, and P9 = 
to compare performance quality between universities. 

F = one-way omnibus test for ANOVA design of three subsamples, (p) = p-value of F. 
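
To make the omnibus F statistic in these tables concrete, the following is a minimal Python sketch of a one-way ANOVA computed from group summaries alone (n, mean, SD). It is an illustration under stated assumptions, not the authors' procedure: the function name `anova_f_from_summary` is hypothetical, the full subsample sizes (16, 66, 24) stand in for the unknown per-item response counts, and the inputs are the rounded P3 (Service) values from Table 2.

```python
def anova_f_from_summary(groups):
    """One-way ANOVA F from (n, mean, sd) summaries, one tuple per group.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group sum of squares recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# P3 (Service) from Table 2: Administrators, Teachers, Researchers.
# Assumption: every respondent in each subsample answered this item.
service = [(16, 5.40, 1.24), (66, 3.91, 1.96), (24, 3.43, 1.83)]

f_stat, df_b, df_w = anova_f_from_summary(service)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

Under these assumptions the sketch yields an F of roughly 5.8, close to but not identical with the reported 5.46. A gap of this kind is to be expected when rounded summaries and nominal subsample sizes replace the raw item-level data.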
Table 3. Relative importance of institutions that determine purposes of performance evaluation

                         Total              Administrs         Teachers           Researchers
                         N = 116            N = 16             N = 66             N = 24
Variable                 Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

P10 Faculty Union        4.20 (2.28) F1     3.33 (1.84) F1     4.54 (2.27) F1     4.23 (2.37) F1     1.81 (.170)
P11 Board of Governors   2.47 (1.85) F2     3.07 (2.19) F2     2.39 (1.73) F2     2.24 (1.67) F2     1.06 (.349)
P12 University-level     3.12 (1.95) F2     4.56 (2.06) F2     2.89 (1.79) F3     2.60 (1.82) F3     6.21 (.003)
P13 Faculty-level        4.65 (1.78) F3     5.37 (1.31) F3     4.51 (1.85) F3     4.45 (1.71) F2     1.70 (.188)
P14 Department-level     5.19 (1.88) F3     5.07 (1.94) F1     5.18 (1.87) F4     5.18 (1.89) F3     0.02 (.976)
P15 Government           1.78 (1.39) F4     1.73 (1.39) F4     1.80 (1.44) F2     1.67 (0.91) F3     0.09 (.917)
P16 Consult Agency       1.85 (1.57) F4     1.80 (1.78) F4     1.94 (1.68) F4     1.90 (1.37) F4     0.04 (.957)

P10 = Faculty Union, P11 = Board of Governors, P12 = Central University Administration, 
P13 = Faculty-level Administration, P14 = Departmental-level Units/Committees, P15 = 
Government Department(s) of Education, P16 = Third-party Consulting Agency. 



Table 4. Frequency of performance evaluation of non-tenured and tenured faculty

                    Total           Administrs      Teachers        Researchers
                    N = 116         N = 16          N = 66          N = 24
Variable            Mean (SD)       Mean (SD)       Mean (SD)       Mean (SD)       F (p)

C17 Non-tenured     1.32 (0.83)     1.31 (1.01)     1.37 (0.83)     1.26 (0.86)     0.16 (.855)
C18 Tenured         2.51 (1.27)     2.25 (1.39)     2.48 (1.23)     2.62 (1.31)     0.41 (.662)

C17 = Frequency to conduct a performance evaluation for non-tenured faculty, C18 = 
Frequency to conduct a performance evaluation for tenured faculty. 

Table 5. Triggers of formal performance evaluation for informational purposes

                         Total              Administrs         Teachers           Researchers
                         N = 116            N = 16             N = 66             N = 24
Variable                 Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

C19A Automatically       5.54 (1.77) F1     5.62 (2.00) F1     5.77 (1.56) F1     5.097 (1.88) F1    1.33 (.268)
C19B Self-request        5.56 (1.74) F2     5.69 (1.70) F2     5.78 (1.63) F2     4.87 (2.05) F2     2.37 (.099)
C19C Dept. Head          2.93 (1.89) F3     3.12 (1.59) F3     2.94 (1.91) F3     2.78 (1.95) F3     0.16 (.854)
C19D Department-level    3.15 (1.88) F3     2.688 (1.49) F3    3.540 (1.94) F3    1.820 (2.70) F3    2.52 (.086)
C19E University-level    2.159 (1.66) F4    2.313 (1.85) F4    2.000 (1.42) F4    2.348 (1.77) F3    0.54 (.582)
C19F Faculty-level       2.99 (1.86) F3     3.00 (1.63) F5     3.09 (1.81) F3     2.43 (1.80) F3     1.16 (.316)
C19G Peers               3.33 (1.98) F2     3.19 (1.87) F6     3.44 (1.93) F2     3.087 (2.17) F4    0.31 (.736)
C19H Students            4.07 (2.20) F2     4.37 (2.33) F7     4.16 (2.17) F5     3.35 (2.35) F4     1.36 (.261)
C19I Faculty Union       3.009 (2.05) F2    2.750 (1.88) F7    2.953 (1.94) F2    3.304 (2.32) F2    0.40 (.671)
C19J Government          1.442 (0.92) F4    1.438 (1.26) F4    1.484 (0.89) F4    1.478 (0.95) F2    0.02 (.985)
C19K Granting Agencies   2.29 (1.70) F5     2.75 (2.41) F8     2.27 (1.55) F5     2.13 (1.66) F2     0.67 (.516)

C19A = automatically each year, C19B = on request by the faculty member, C19C = by 
department head, C19D = by a departmental committee, C19E = from a university-level 
administrative unit, C19F = from the faculty dean/director, C19G = by formal request from 
peers, C19H = by formal request from students, C19I = by faculty union, C19J = by 
government, C19K = by request from granting agencies. 

Table 6. Ratings of institutions which can administer the routine process of performance 
evaluation

                         Total              Administrs         Teachers           Researchers
                         N = 116            N = 16             N = 66             N = 24
Variable                 Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

C20 Faculty Union        3.27 (2.29) F1     1.87 (1.51) F1     3.52 (2.28) F1     4.00 (2.43) F1     4.51 (.013)
C21 Licencing Body       3.35 (2.15) F2     2.67 (1.88) F2     3.55 (2.12) F2     2.83 (2.15) F2     1.69 (.190)
C22 Board of Governors   2.09 (1.66) F3     2.33 (1.88) F2     2.09 (1.68) F3     1.74 (1.18) F3     0.68 (.507)
C23 University-level     2.58 (1.97) F3     3.47 (2.39) F2     2.44 (1.87) F3     2.26 (1.74) F3     2.07 (.131)
C24 Faculty-level        4.62 (2.02) F4     5.53 (1.64) F3     4.59 (2.14) F3     4.43 (1.70) F3     1.62 (.204)
C25 Department-level     5.22 (1.74) F4     5.00 (2.07) F4     5.28 (1.28) F4     5.08 (1.72) F2     0.22 (.801)
C26 Local Government     1.29 (0.74) F2     1.47 (1.30) F5     1.28 (0.68) F4     1.35 (0.57) F3          (.699)
C27 Consult Agency       1.78 (1.65) F2     1.47 (1.55) F5     1.96 (1.87) F2     1.61 (1.23) F4     0.74 (.482)
Table 7. Ratings of institutions which can access individual performance evaluation 
information

                         Total              Administrs         Teachers           Researchers
                         N = 116            N = 16             N = 66             N = 24
Variable                 Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

C28 Students             3.64 (2.46) F1     3.31 (2.65) F1     3.95 (2.48) F1     3.25 (2.29) F1     0.93 (.399)
C29 Other Faculty        3.35 (2.15) F2     2.67 (1.88) F2     3.55 (2.12) F2     2.83 (2.15) F2     1.69 (.190)
C30 Department Head      2.09 (1.66) F3     2.33 (1.88) F1     2.09 (1.68) F3     1.74 (1.18) F3     0.68 (.507)
C31 Faculty-level        2.58 (1.97) F4     3.47 (2.39) F3     4.59 (2.14) F4     2.26 (1.74) F4     2.07 (.131)
C32 Faculty Union        4.62 (2.02) F5     5.53 (1.64) F4     5.28 (1.67) F5     4.43 (1.70) F5     1.62 (.204)
C33 University-level     5.22 (1.74) F6     5.00 (2.07) F5     1.28 (0.68) F6     5.08 (1.72) F6     0.22 (.801)
C34 Public               1.29 (0.74) F7     1.47 (1.30) F6     1.97 (1.87) F7     1.35 (0.57) F7     0.36 (.699)
C35 Government           1.78 (1.65) F8     1.47 (1.55) F6     1.97 (1.87) F8     1.61 (1.23) F8     0.74 (.482)
C36 Private Agencies     3.64 (2.46) F1     3.31 (2.65) F1     3.95 (2.48) F1     3.25 (2.29) F1     0.93 (.399)
Table 8. Weights (in percentages) of 14 performance indicators

                            Total              Administrs         Teachers           Researchers
                            N = 116            N = 16             N = 66             N = 24
Variable                    Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

I38A Experience             7.38 (18.85) F1    3.76 (4.91) F1     7.80 (20.17) F1    8.56 (22.60) F1    0.30 (.743)
I38B Grants                 8.41 (5.84) F2     11.12 (7.34) F2    6.96 (5.50) F2     10.65 (4.84) F2    5.47 (.006)
I38C PR Conference          6.74 (3.83) F2     5.76 (3.85) F3     6.96 (3.87) F2     6.52 (4.11) F2     0.56 (.575)
I38D NPR Conference         3.51 (3.49) F3     2.48 (2.50) F4     4.12 (3.78) F3     2.61 (3.22) F3     2.31 (.105)
I38E PR Journal Pub         14.30 (9.76) F2    9.69 (8.17) F2     13.51 (9.59) F2    18.17 (7.85) F3    4.13 (.019)
I38F NPR Journal Pub        3.69 (3.34) F3     2.84 (3.16) F4     4.07 (3.52) F4     3.13 (3.22) F4     1.14 (.323)
I38G Book Publication       11.86 (8.28) F2    10.05 (8.96) F2    11.00 (7.38) F2    15.13 (10.04) F4   2.48 (.091)
I38H Community Service      6.36 (5.20) F4     6.29 (5.41) F5     7.27 (5.39) F5     4.52 (4.46) F5     2.35 (.100)
I38I Students-supervised    5.71 (4.92) F5     6.02 (5.69) F6     5.53 (4.85) F6     4.78 (3.84) F6     0.33 (.716)
I38J Graduate Success       2.30 (3.87) F5     1.50 (3.03) F7     2.34 (4.28) F7     2.26 (3.28) F7     0.26 (.768)
I38K Courses-taught         6.26 (9.56) F6     11.88 (18.82) F8   6.02 (7.51) F8     3.04 (4.19) F8     3.89 (.024)
I38L Course Evaluation      13.15 (11.23) F7   12.17 (11.47) F9   14.58 (11.64) F5   12.17 (10.75) F5   0.52 (.599)
I38M Reputation             3.74 (5.14) F8     3.93 (5.94) F1     3.84 (5.48) F3     3.26 (4.16) F5     0.11 (.893)
I38N Awards                 6.04 (4.78) F8     6.07 (4.01) F10    6.21 (5.25) F4     5.96 (4.33) F9     0.02 (.977)

Tables 8, 9, 10 and 11 contain the same following variables: A = Years of experience, 
B = External research grants, C = Peer-reviewed conference presentations, D = Non peer-reviewed 
conference presentations, E = Peer-reviewed journal publication, F = Non peer-reviewed 
journal publication, G = Book publication, H = Community service, I = Number of 
students currently supervised, J = Career success of former students, K = Number of courses 
taught, L = Course evaluation by students, M = Reputation of faculty among peers, N = 
Awards, merits, public recognition. 

Table 9. "Subjective" validity ratings of 14 performance indicators

                            Total              Administrs         Teachers           Researchers
                            N = 116            N = 16             N = 66             N = 24
Variable                    Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

I39A Experience             3.34 (2.04) F1     3.00 (2.31) F1     3.61 (1.99) F1     2.83 (1.97) F1     1.50 (.229)
I39B Grants                 4.40 (1.81) F2     5.17 (1.47) F2     3.98 (1.88) F2     4.83 (1.63) F2     3.48 (.035)
I39C PR Conference          4.71 (1.65) F3     4.83 (1.80) F2     4.73 (1.63) F3     4.50 (1.47) F3     0.24 (.790)
I39D NPR Conference         3.23 (1.70) F4     3.42 (1.00) F3     3.33 (1.71) F4     2.68 (1.64) F4     1.42 (.247)
I39E PR Journal             5.78 (1.38) F2     5.50 (1.38) F2     5.61 (1.51) F2     6.17 (0.70) F2     1.70 (.189)
I39F NPR Journal            3.53 (1.65) F4     3.92 (0.79) F4     3.45 (1.62) F4     3.48 (1.88) F1     0.42 (.656)
I39G Book Publication       5.59 (1.67) F2     5.50 (1.78) F5     5.45 (1.74) F2     5.71 (1.46) F2     0.20 (.817)
I39H Community Service      4.16 (1.77) F5     4.00 (1.73) F4     4.37 (1.80) F3     3.65 (1.69) F5     1.48 (.233)
I39I Students-supervised    3.89 (1.83) F6     3.67 (1.92) F2     4.03 (1.82) F5     3.26 (1.74) F4     1.57 (.213)
I39J Graduate Success       2.87 (1.78) F7     2.58 (1.62) F6     2.75 (1.70) F6     2.92 (2.00) F6     0.15 (.858)
I39K Courses-taught         3.74 (1.74) F6     3.83 (1.75) F7     3.95 (1.67) F5     2.74 (1.54) F7     4.63 (.012)
I39L Course Evaluation      4.77 (1.85) F5     4.67 (1.87) F8     4.84 (1.94) F3     4.61 (1.67) F5     0.15 (.859)
I39M Reputation             3.82 (1.92) F7     3.67 (1.72) F9     3.81 (1.93) F6     3.58 (1.95) F6     0.13 (.882)
I39N Awards                 4.52 (1.76) F7     4.61 (1.26) F10    4.36 (1.86) F3     4.54 (1.59) F1     0.17 (.847)
Table 10. "Objective" validity ratings of 14 performance indicators

                            Total              Administrs         Teachers           Researchers
                            N = 116            N = 16             N = 66             N = 24
Variable                    Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   Mean (SD) Factor   F (p)

I40A Experience             3.60 (2.30) F1     3.36 (2.42) F1     3.79 (2.23) F1     3.22 (2.41) F1     0.59 (.555)
I40B Grants                 4.73 (1.71) F2     5.73 (1.10) F2     4.47 (1.82) F2     4.65 (1.61) F2     2.50 (.088)
I40C PR Conference Pres     5.15 (1.44) F2     5.36 (1.12) F2     5.15 (1.44) F3     4.91 (1.47) F3     0.43 (.655)
I40D NPR Conference         3.45 (1.75) F1     3.00 (1.34) F3     3.54 (1.66) F4     3.30 (2.03) F4     0.52 (.595)
I40E PR Journal             5.59 (1.33) F2     5.82 (1.08) F2     5.47 (1.40) F5     5.61 (1.41) F5     0.33 (.719)
I40F NPR Journal            3.48 (1.75) F1     2.73 (1.35) F3     3.58 (1.63) F4     3.48 (2.06) F4     1.13 (.327)
I40G Book Publication       5.36 (1.55) F2     5.09 (1.04) F4     5.30 (1.50) F6     5.48 (1.83) F6     0.24 (.784)
I40H Community Service      4.02 (1.81) F3     4.18 (1.33) F5     4.27 (1.82) F7     3.48 (1.90) F7     1.65 (.198)
I40I Students-supervised    5.79 (2.55) F4     6.19 (3.17) F6     5.75 (2.35) F8     5.87 (2.72) F8     0.19 (.828)
I40J Graduate Success       1.16 (0.37) F5     1.06 (0.25) F7     1.12 (0.33) F9     1.29 (0.46) F9     2.62 (.077)
I40K Courses-taught         4.54 (4.43) F6     4.50 (3.20) F2     4.06 (2.41) F10    4.27 (2.75) F10    0.19 (.823)
I40L Course Evaluation      1.07 (0.25) F5     1.063 (0.25) F7    1.06 (0.24) F11    1.12 (0.34) F10    0.54 (.587)
I40M Reputation             5.14 (2.42) F6     5.06 (2.43) F2     5.18 (2.32) F12    5.08 (2.68) F11    0.03 (.975)
I40N Awards                 1.05 (0.22) F7     1.12 (0.34) F8     1.03 (0.17) F13    1.04 (0.20) F12    1.29 (.280)



Table 11. Differences in "subjective" and "objective" validity ratings of 14 performance 
indicators

                            Total                Administrs           Teachers             Researchers
                            N = 116              N = 16               N = 66               N = 24
Variable                    Mean(d)  t(d)        Mean(d)  t(d)        Mean(d)  t(d)        Mean(d)  t(d)        F (p)

I40A Experience             -0.23    -1.13       -0.36    -0.40       -0.15    -0.61       -0.48    -1.75       0.25 (.780)
I40B Grants                 -0.24    -1.48       -0.10    -0.43       -0.47    -1.98       0.26     0.92        1.69 (.190)
I40C PR Conference          -0.35    -2.36+      0.10     0.36        -0.39    -2.00+      -0.48    -1.39       0.56 (.571)
I40D NPR Conference         -0.26    -1.86       0.30     0.71        -0.25    -1.58       -0.76    -2.31+      2.36 (.101)
I40E PR Journal             0.22     1.63        -0.20    -0.80       0.19     1.08        0.56     1.62        1.20 (.306)
I40F NPR Journal            0.00     0.00        1.10     2.09        -0.17    -1.15       -0.18    -0.78       5.10 (.008)
I40G Book Publication       0.24     1.49        0.70     1.48        0.15     0.67        0.22     0.68        0.46 (.634)
I40H Community Service      0.11     0.66        -0.18    -0.28       0.03     0.16        0.09     0.37        0.11 (.891)
I40I Students-supervised    -1.89    -6.62***    -2.50+   -2.57       -1.68    -4.42***    -2.59    -4.75***    0.99 (.376)
I40J Graduate Success       2.71     15.51***    2.50     5.33**      2.64     12.08***    2.62     6.24***     0.03 (.970)
I40K Courses-taught         -0.71    -1.58       -0.33    -0.33       -0.11    -0.35       -1.57    -2.19+      2.07 (.132)
I40L Course Evaluation      4.70     26.48***    4.58     8.67***     4.80     19.65***    4.48     13.03***    0.27 (.764)
I40M Reputation             -1.14    -3.75*      -0.83    -0.83       -1.24    -3.01++     -1.50    -2.54+      0.18 (.836)
I40N Awards                 4.48     25.99***    4.46     12.09***    4.35     18.23       4.50     14.13       0.07 (.929)

+ = p < .05, ++ = p < .01, * = p < .001, ** = p < .0001 
Table 12. Overall ratings of structural factors that characterize a performance evaluation 
system

                    Total           Administrs      Teachers        Researchers
                    N = 116         N = 16          N = 66          N = 24
Variable            Mean (SD)       Mean (SD)       Mean (SD)       Mean (SD)       F (p)

P41 Purposes        5.79 (2.55)     6.19 (3.17)     5.74 (2.35)     5.87 (2.72)     0.19 (.828)
C42 Control         4.54 (4.43)     4.50 (3.20)     4.06 (2.41)     4.27 (2.75)     0.19 (.823)
I43 Procedures      5.14 (2.42)     5.06 (2.43)     5.18 (2.32)     5.08 (2.68)     0.03 (.975)
S44 Satisfaction    5.03 (2.34)     4.56 (2.87)     5.25 (2.21)     4.96 (1.97)     0.62 (.539)

P41 = A continuum that describes the balance of purposes of performance evaluation from 
"Informational" (rated 1) to "Evaluative" (rated 10), C42 = A continuum that describes the 
balance of control of performance evaluation from "Internal" (rated 1) to "External" (rated 
10), I43 = A continuum that describes the balance of standards of performance evaluation 
from "Professional standards" (rated 1) to "Institutional standards" (rated 10), S44 = your own 
level of overall satisfaction with the existing process of performance evaluation at your own 
institution from "Very dissatisfied" (rated 1) to "Completely satisfied" (rated 10). 

Figure 1. Locations of Some Performance Indicators in Three Dimensional Grid (Purpose, 
Control and Process): A Hypothetical Example 

[Figure not reproduced; surviving axis labels: "PURPOSE", "EMANCIPATORY"] 

Figure 2. Diagrams of Rank-ordered Means and Factorial Groupings in the Administrators, Teachers and Researchers Subsamples 
Corresponding to Tables 2 to 10 